Fine-grained parallelization of similarity search between protein sequences
Authors
Abstract
This report presents the implementation of a protein sequence comparison algorithm specifically designed so that its most time-consuming parts can run on parallel hardware such as SSE instruction units, multicore architectures, or graphics boards. Three programs have been developed: PLAST-P, TPLAST-N and PLAST-X. They produce results equivalent to those of the NCBI BLAST family programs (BLAST-P, TBLAST-N and BLAST-X), with speed-up factors ranging from 5 to 10.

Key-words: Parallelization, similarity search, indexing, BLAST, GPU, SIMD
Similar resources
Compiler Parallelization Techniques for Tiled Multicore Processors
Recently, tiled multicore processors have been proposed as a solution to provide both performance and scalability. Unlike conventional multicore processors, tiled microprocessors provide on-chip networks to exploit fine-grained parallelism. However, the performance of tiled microprocessors largely depends on compilers because of their relatively simple hardware; exploitation of parallelism, com...
Fine-grained parallelization of lattice QCD kernel routine on GPUs
Simulation time for the classical problem of Lattice Quantum Chromodynamics (Lattice QCD) is dominated by one kernel routine responsible for computing the actions of a Dirac operator. This paper describes an experience in parallelizing this kernel routine. We explore parallelization granularities for this kernel routine on Graphical Processing Units (GPUs). We show that fine-grained parallelism...
A parallel version of the D-Ant algorithm for the Vehicle Routing Problem
In this paper we study a parallel implementation of the D-Ant algorithm developed by Reimann, Doerner and Hartl [9] for solving the Vehicle Routing Problem. The main idea in this algorithm is to speed up the search by letting the ants solve only sub-problems rather than the whole problem. This algorithm is well suited for parallelization. We propose a mixed parallelization strategy which combin...
A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences
The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...
Large-scale phylogenetic analysis on current HPC architectures
Phylogenetic inference is considered a grand challenge in Bioinformatics due to its immense computational requirements. The increasing popularity and availability of large multi-gene alignments as well as comprehensive datasets of single nucleotide polymorphisms (SNPs) in current biological studies, coupled with rapid accumulation of sequence data in general, pose new challenges for high perfor...